Improving accuracy of Winograd convolution for DNNs

Authors

  • Barbara Barabasz
  • Andrew Anderson
  • David Gregg
Abstract

Modern deep neural networks (DNNs) spend a large amount of their execution time computing convolutions. Winograd's minimal algorithm for small convolutions can greatly reduce the number of arithmetic operations. However, a large reduction in floating point (FP) operations in these algorithms can result in significantly reduced FP accuracy of the result. In this paper we propose several methods for reducing the FP error of these algorithms. Minimal convolution algorithms depend on the selection of several numeric points that have a large impact on the accuracy of the result. Some points are known to be better than others, but there is no systematic method for selecting points for small convolutions. We show that there is a relatively small number of important cases for DNN convolution, which can be searched empirically. We compare both standard and modified versions of the Winograd algorithm. Further, we demonstrate that both the ordering and the value of the points are important, and we propose a canonical evaluation ordering based on Huffman coding that reduces both the FP error and the size of the search space. We find that good point selections depend on the values of the points themselves and on symmetries between different points. We show that sets of points with symmetric groups give better results. In addition, we explore other methods to reduce FP error, including mixed-precision convolution and pairwise addition across DNN channels. Using our methods we can significantly reduce FP error for a given Winograd convolution block size, which allows larger block sizes and reduced computation.
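To make the role of the numeric points concrete, below is a minimal NumPy sketch of the standard Winograd F(2,3) kernel (the small-convolution formulation popularised by Lavin and Gray): two outputs of a 1D convolution with a 3-tap filter computed with four multiplications instead of six. The transform matrices shown correspond to the interpolation points {0, 1, -1} plus the point at infinity; this is an illustrative example, not code from the paper itself.

```python
import numpy as np

# Winograd F(2,3): the transforms below are derived from the
# interpolation points {0, 1, -1} (plus the point at infinity) --
# the "points" whose selection the paper shows governs FP error.

BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=np.float32)

G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]], dtype=np.float32)

AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=np.float32)

def winograd_f23(d, g):
    """One F(2,3) tile: d is a 4-element input tile, g a 3-tap filter."""
    U = G @ g            # filter transform (precomputable once per filter)
    V = BT @ d           # input transform
    M = U * V            # 4 elementwise multiplications
    return AT @ M        # output transform: 2 convolution outputs

d = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
g = np.array([0.5, 1.0, -0.25], dtype=np.float32)
print(winograd_f23(d, g))                     # Winograd result
print(np.convolve(d, g[::-1], mode="valid"))  # direct result, for comparison
```

The direct method needs six multiplications for the same two outputs; the transforms trade two of them for extra additions, and it is the rounding introduced by those transforms, growing with block size and point choice, that the paper's techniques target.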


Similar resources

Enabling Sparse Winograd Convolution by Native Pruning

Sparse methods and the use of Winograd convolutions are two orthogonal approaches, each of which significantly accelerates convolution computations in modern CNNs. Sparse Winograd merges these two and thus has the potential to offer a combined performance benefit. Nevertheless, training convolution layers so that the resulting Winograd kernels are sparse has not hitherto been very successful. B...


Pruning of Winograd and FFT Based Convolution Algorithm

Winograd- and FFT-based convolution are two efficient convolution algorithms targeting high-performance inference. Their efficiency comes from the reduction of the number of multiplication operations due to linear and Fourier transforms. However, the two existing approaches cannot handle efficient compression of the neural network, which might contribute significant improvement in computation and...


MEC: Memory-efficient Convolution for Deep Neural Network

Convolution is a critical component in modern deep neural networks, thus several algorithms for convolution have been developed. Direct convolution is simple but suffers from poor performance. As an alternative, multiple indirect methods have been proposed, including im2col-based convolution, FFT-based convolution, or the Winograd-based algorithm. However, all these indirect methods have high memory-...
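For readers unfamiliar with the indirect method named above, a minimal sketch of im2col-based convolution follows: it lowers a 2D convolution to a single matrix multiplication by copying every KxK input patch into a column. The function names and shapes here are illustrative assumptions, not code from the MEC paper.

```python
import numpy as np

def im2col(x, k):
    """x: (H, W) input, k: filter size. Returns a (k*k, num_patches) matrix."""
    H, W = x.shape
    out_h, out_w = H - k + 1, W - k + 1
    cols = np.empty((k * k, out_h * out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Each KxK patch becomes one column of the lowered matrix.
            cols[:, i * out_w + j] = x[i:i + k, j:j + k].ravel()
    return cols

def conv2d_im2col(x, w):
    """Valid 2D correlation of x with a (k, k) filter w via matrix multiply."""
    k = w.shape[0]
    out_h, out_w = x.shape[0] - k + 1, x.shape[1] - k + 1
    return (w.ravel() @ im2col(x, k)).reshape(out_h, out_w)

x = np.arange(25, dtype=np.float32).reshape(5, 5)
w = np.ones((3, 3), dtype=np.float32) / 9.0
print(conv2d_im2col(x, w))  # 3x3 result; matches direct sliding-window correlation
```

The lowered matrix duplicates each input element up to k*k times, which is exactly the memory overhead that memory-efficient schemes such as MEC aim to reduce.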


Winograd's Short DFT Algorithms

In 1976, S. Winograd [20] presented a new DFT algorithm which had significantly fewer multiplications than the Cooley-Tukey FFT which had been published eleven years earlier. This new Winograd Fourier Transform Algorithm (WFTA) is based on the type-one index map from Multidimensional Index Mapping, with each of the relatively prime length short DFTs calculated by very efficient special algorithms. ...


Efficient Sparse-Winograd Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are compute-intensive, which limits their application on mobile devices. Their energy is dominated by the number of multiplies needed to perform the convolutions. Winograd's minimal filtering algorithm (Lavin (2015)) and network pruning (Han et al. (2015)) reduce the operation count. Unfortunately, these two methods cannot be combined, because applying the W...



Journal:

Volume   Issue

Pages  -

Publication date: 2018